Introducing Urdu Digits Dataset with Demonstration of an Efficient and Robust Noisy Decoder-Based Pseudo Example Generator
Authors
Abstract
In the present work, we propose a novel method utilizing only a decoder for the generation of pseudo-examples, which has shown great success in image classification tasks. The proposed method is particularly constructive when the data are in limited quantity, as in semi-supervised learning (SSL) or few-shot learning (FSL). While most previous works have used an autoencoder to improve classification performance for SSL, using a single autoencoder may generate confusing pseudo-examples that could degrade the classifier's performance. On the other hand, various models that utilize an encoder–decoder architecture for sample generation can significantly increase computational overhead. To address the issues mentioned above, we propose an efficient means of generating pseudo-examples by using only the generator (decoder) network, separately for each class, which is shown to be effective for both SSL and FSL. In our approach, the decoder is trained for each class using random noise, and multiple samples are generated using the trained decoder. Our generator-based approach outperforms previous state-of-the-art SSL and FSL approaches. In addition, we released the Urdu digits dataset consisting of 10,000 images, including 8000 training and 2000 test images, collected through three different methods for purposes of diversity. Furthermore, we explored the effectiveness of our proposed method on the Urdu digits dataset using both SSL and FSL, which demonstrated improvements of 3.04% and 1.50% in terms of average accuracy, respectively, illustrating the superiority of the proposed method compared to current state-of-the-art models.
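The per-class generation idea described in the abstract can be illustrated with a small sketch. The abstract does not give the decoder architecture, loss, or noise scheme, so everything concrete here is an assumption: a linear decoder, a squared-error loss, one fixed random noise code per training sample, and pseudo-examples obtained by decoding slightly perturbed codes. The function names (`train_class_decoder`, `decode`, `generate_pseudo_examples`) and the toy data are hypothetical and use only the Python standard library.

```python
import random


def train_class_decoder(samples, noise_dim=4, epochs=500, lr=0.02, seed=0):
    """Fit a linear decoder x ~ W z + b for ONE class (assumed model).

    Each training sample is paired with a fixed random noise code;
    this pairing is an illustrative assumption, not the paper's recipe.
    """
    rng = random.Random(seed)
    out_dim = len(samples[0])
    codes = [[rng.gauss(0, 1) for _ in range(noise_dim)] for _ in samples]
    W = [[0.0] * noise_dim for _ in range(out_dim)]
    b = [0.0] * out_dim
    for _ in range(epochs):
        for z, x in zip(codes, samples):
            for i in range(out_dim):
                pred = b[i] + sum(W[i][j] * z[j] for j in range(noise_dim))
                err = pred - x[i]  # gradient of 0.5 * err**2 w.r.t. pred
                b[i] -= lr * err
                for j in range(noise_dim):
                    W[i][j] -= lr * err * z[j]
    return W, b, codes


def decode(W, b, z):
    """Map a noise code z to a sample with the linear decoder."""
    return [b_i + sum(w * z_j for w, z_j in zip(row, z))
            for row, b_i in zip(W, b)]


def generate_pseudo_examples(W, b, codes, n, noise_scale=0.1, seed=1):
    """Decode randomly perturbed training codes into n pseudo-examples."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        base = rng.choice(codes)
        z = [v + rng.gauss(0, noise_scale) for v in base]
        examples.append(decode(W, b, z))
    return examples


# Toy stand-in for image data: two classes of 4-"pixel" vectors.
toy = {
    0: [[1.0, 0.9, 0.0, 0.1], [0.9, 1.0, 0.1, 0.0], [1.0, 1.0, 0.0, 0.0]],
    1: [[0.0, 0.1, 1.0, 0.9], [0.1, 0.0, 0.9, 1.0], [0.0, 0.0, 1.0, 1.0]],
}

# One decoder per class, trained separately, as the abstract describes.
pseudo = {}
for label, samples in toy.items():
    W, b, codes = train_class_decoder(samples)
    pseudo[label] = generate_pseudo_examples(W, b, codes, n=5)
```

Because each decoder only ever sees samples from its own class, the pseudo-examples it produces inherit that class label, which is the property the abstract relies on to avoid confusing pseudo-examples. The generated samples would then be added to the labeled training set of a classifier; that step is omitted here.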
Similar resources
An Efficient Discrete Log Pseudo Random Generator
The exponentiation function in a finite field of order p (a prime number) is believed to be a one-way function. It is well known that O(log log p) bits are simultaneously hard for this function. We consider a special case of this problem, the discrete logarithm with short exponents, which is also believed to be hard to compute. Under this intractability assumption we show that discrete exponent...
An Efficient MMSE Source Decoder for Noisy Channels
Exploiting the residual redundancy in a source coder output stream during the decoding process has been proven to be a bandwidth efficient way to combat the noisy channel degradations [2]-[7]. Researchers have recently developed techniques to employ this redundancy to either assist the channel decoder for improved performance or design effective source decoders. However, the dominant method use...
UCOM offline dataset-an urdu handwritten dataset generation
A benchmark database for character recognition is an essential part of efficient and robust development. Unfortunately, there is no comprehensive handwritten dataset for the Urdu language that could be used to compare the state-of-the-art techniques in the field of optical character recognition. In this paper, we present a new and publicly available dataset comprising 600 pages of handwritten Ur...
PronouncUR: An Urdu Pronunciation Lexicon Generator
State-of-the-art speech recognition systems rely heavily on three basic components: an acoustic model, a pronunciation lexicon and a language model. To build these components, a researcher needs linguistic as well as technical expertise, which is a barrier in low-resource domains. Techniques to construct these three components without having expert domain knowledge are in great demand. Urdu, des...
Journal
Journal title: Symmetry
Year: 2022
ISSN: 0865-4824, 2226-1877
DOI: https://doi.org/10.3390/sym14101976